Combining Efficient XML Compression with Query Processing
Authors
Abstract
This paper describes a new XML compression scheme that offers both high compression ratios and short query response times. Its core is a fully reversible transform that substitutes every word in an XML document using a semi-dynamic dictionary, effectively encodes dictionary indices as well as the numbers, dates, and times found in the document, and groups data from the same structural context into individual containers. Test results show that the proposed scheme attains compression ratios rivaling those of the best available algorithms, together with fast compression, decompression, and query processing.
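The abstract gives no implementation details, so the following is only a minimal Python sketch of two of the ideas it names: grouping text by structural context into containers, and replacing words with dictionary indices. It is not the authors' method. It assumes that "structural context" means the root-to-element path and that a semi-dynamic dictionary is one built in a first pass over the document and stored with the output. All function names and the `min_count` threshold are hypothetical. The sketch does not cover the encoding of numbers, dates, and times or the compact coding of indices, and it round-trips only the container values, not the whole document.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter, defaultdict

# Alternating runs of letters (candidate words) and everything else (separators).
TOKEN = re.compile(r"[A-Za-z]+|[^A-Za-z]+")


def collect_containers(xml_text):
    """Group element text by root-to-element path (one container per path)."""
    containers = defaultdict(list)

    def walk(node, path):
        path = path + "/" + node.tag
        if node.text and node.text.strip():
            containers[path].append(node.text.strip())
        for child in node:
            walk(child, path)

    walk(ET.fromstring(xml_text), "")
    return dict(containers)


def build_dictionary(containers, min_count=2):
    """First pass: keep words that occur at least min_count times (assumed policy)."""
    counts = Counter(
        tok
        for values in containers.values()
        for value in values
        for tok in TOKEN.findall(value)
        if tok.isalpha()
    )
    return [word for word, n in counts.most_common() if n >= min_count]


def encode(value, index):
    """Replace dictionary words with integer indices; keep other tokens verbatim."""
    return [index.get(tok, tok) for tok in TOKEN.findall(value)]


def decode(tokens, words):
    """Invert encode(): integers are looked up, strings are copied through."""
    return "".join(words[t] if isinstance(t, int) else t for t in tokens)


sample = (
    "<log>"
    "<entry><user>alice</user><msg>disk full on node</msg></entry>"
    "<entry><user>bob</user><msg>disk full on rack</msg></entry>"
    "</log>"
)

containers = collect_containers(sample)
words = build_dictionary(containers)
index = {word: i for i, word in enumerate(words)}
encoded = {
    path: [encode(value, index) for value in values]
    for path, values in containers.items()
}
restored = {
    path: [decode(tokens, words) for tokens in values]
    for path, values in encoded.items()
}
```

In this sketch, a query that touches only `/log/entry/user` would need to decode just that one container, which is the intuition behind combining container grouping with query processing.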
Similar resources
Compressing and Filtering XML Streams
Information technology is widely adopting the use of XML for information exchange. As messaging standards migrate to XML, there is growing concern for the magnitude of messages compared to binary formatted messages. XML compression can help mitigate the risk of exceeding the capacity of current communication resources. However, it is critical that compression technologies do not hinder XML quer...
MQX: Multi-Query Processing Engine for Compressed XML Data
In this demonstration, we present an XML query engine MQX, which is developed for processing multiple subscribed XPath queries over compressed XML documents. MQX is equipped with efficient mechanisms for query rewriting, query organization and query optimization. To our knowledge, this is the first prototype to address the problem of efficiently processing multiple XML queries in a co-operative...
SpiderX: Fast XML Exploration System
Keyword search in XML has gained popularity as it enables users to easily access XML data without the need of learning query languages and studying complex data schemas. In XML keyword search, query semantics is based on the concept of Lowest Common Ancestor (LCA), e.g., SLCA and ELCA. However, LCA-based search methods depend heavily on hierarchical structures of XML data, which may result in m...
Indexing and Querying Semistructured Data Views of Relational Database
The most promising and dominant data format for processing and representing data on the Internet is the semistructured format known as XML. XML data has no fixed schema; it evolves and is self-describing, which makes it harder to manage than, for example, relational data. XML queries differ from relational queries in that the former are expressed as path expressions. The efficien...
BSBC: Towards a Succinct Data Format for XML Streams
XML data compression is an important feature in XML data exchange, particularly when the data size may cause bottlenecks or when bandwidth and energy consumption limitations require reducing the amount of the exchanged XML data. However, applications based on XML data streams also require efficient path query processing on the structure of compressed XML data streams. We present a succinct repr...